Crowdsourcing, in which human intelligence and productivity is dynamically mobilized to tackle tasks too complex for automation alone to handle, has grown to be an important research topic and inspired new businesses (e.g., Uber, Airbnb). Over the years, crowdsourcing has morphed from providing a platform where workers and tasks can be matched up manually into one which leverages data-driven algorithmic management approaches powered by artificial intelligence (AI) to achieve increasingly sophisticated optimization objectives. In this paper, we provide a survey presenting a unique systematic overview on how AI can empower crowdsourcing - which we refer to as AI-Empowered Crowdsourcing(AIEC). We propose a taxonomy which divides algorithmic crowdsourcing into three major areas: 1) task delegation, 2) motivating workers, and 3) quality control, focusing on the major objectives which need to be accomplished. We discuss the limitations and insights, and curate the challenges of doing research in each of these areas to highlight promising future research directions.
translated by 谷歌翻译
先前的关于自我监督预训练的研究重点是联合培训方案,在该场景中,假定大量未标记的数据一次性地将其作为输入,只有那时才受过培训的学习者。不幸的是,这种问题设置通常是不切实际的,即使不是不可行的,因为许多现实世界的任务依赖于顺序学习,例如,数据是以流方式分散或收集的。在本文中,我们对通过流数据进行了对自我监督的预训练进行了首次彻底而专门的研究,旨在阐明这种被忽视的设置下的模型行为。具体而言,我们在来自ImageNet和域内的四类预训练流数据数据上预先培训超过500个模型,并在三种类型的下游任务和12个不同的下游数据集上对其进行评估。我们的研究表明,以某种方式超出了我们的期望,通过简单的数据重播或参数正则化,顺序的自我监督预训练的预训练证明是联合预训练的有效替代方法,因为前者的性能主要与这些培训相同后者。此外,灾难性的遗忘是顺序监督学习中的一个常见问题,在顺序的自学学习(SSL)中得到了极大的缓解,这是通过我们对损失景观中最小值的表示和敏锐度的全面经验分析来很好地证明的。因此,我们的发现表明,在实践中,对于SSL,可以主要通过顺序学习来代替繁琐的联合培训,这反过来又可以更广泛的潜在应用方案。
translated by 谷歌翻译
With the advanced request to employ a team of robots to perform a task collaboratively, the research community has become increasingly interested in collaborative simultaneous localization and mapping. Unfortunately, existing datasets are limited in the scale and variation of the collaborative trajectories, even though generalization between inter-trajectories among different agents is crucial to the overall viability of collaborative tasks. To help align the research community's contributions with realistic multiagent ordinated SLAM problems, we propose S3E, a large-scale multimodal dataset captured by a fleet of unmanned ground vehicles along four designed collaborative trajectory paradigms. S3E consists of 7 outdoor and 5 indoor sequences that each exceed 200 seconds, consisting of well temporal synchronized and spatial calibrated high-frequency IMU, high-quality stereo camera, and 360 degree LiDAR data. Crucially, our effort exceeds previous attempts regarding dataset size, scene variability, and complexity. It has 4x as much average recording time as the pioneering EuRoC dataset. We also provide careful dataset analysis as well as baselines for collaborative SLAM and single counterparts. Data and more up-to-date details are found at https://github.com/PengYu-Team/S3E.
translated by 谷歌翻译
移动通知已成为社交网络服务的主要通信渠道,以使用户了解和参与。随着越来越多的移动应用程序向用户推出通知,他们不断面临关于发送什么,何时以及如何发送的决定。缺乏研究和方法论通常会导致启发式决策。许多通知到达不适当的时刻或引入太多中断,未能为用户提供价值并激发用户的投诉。在本文中,我们探讨了移动通知和用户参与度之间交互的独特功能。我们提出了一个国家过渡框架,以定量评估通知的有效性。在此框架内,我们开发了一个假设对数线性结构和Weibull分布的徽章通知的生存模型。我们的结果表明,与逻辑回归模型相比,该模型对应用程序的灵活性和卓越的预测准确性具有更大的灵活性。特别是,我们提供了一个在线用例,以进行通知交付时间优化,以显示我们如何做出更好的决策,推动更多用户参与度并为用户提供更多价值。
translated by 谷歌翻译
本文解决了几秒钟学习问题,旨在从几个例子中学习新的视觉概念。在几次拍摄分类中的常见问题设置假设在获取数据标签中的随机采样策略,其在实际应用中效率低下。在这项工作中,我们介绍了一个新的预算感知几秒钟学习问题,不仅旨在学习新的对象类别,还需要选择信息实例来注释以实现数据效率。我们为我们的预算感知几秒钟学习任务开发了一个元学习策略,该任务共同了解基于图形卷积网络(GCN)和基于示例的少量拍摄分类器的新型数据选择策略。我们的选择策略通过图形消息传递计算每个未标记数据的上下文敏感表示,然后用于预测顺序选择的信息性分数。我们在迷你想象网,分层 - 想象项目和omniglot数据集上进行广泛的实验验证我们的方法。结果表明,我们的几次学习策略优于一个相当大的边缘,这表明了我们的方法的功效。
translated by 谷歌翻译
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on nuScenes benchmark. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
translated by 谷歌翻译
Knowledge graphs (KG) have served as the key component of various natural language processing applications. Commonsense knowledge graphs (CKG) are a special type of KG, where entities and relations are composed of free-form text. However, previous works in KG completion and CKG completion suffer from long-tail relations and newly-added relations which do not have many know triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to challenge the problem of limited annotated data. In this paper, we comprehensively survey previous attempts on such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KGs and the methods. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions of FKGC.
translated by 谷歌翻译
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes with limited several support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework. Our key insights are two folds: Firstly, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Secondly, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice from two aspects, i.e., feature-level and instance-level. In particular, we firstly design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, the novel classes can be improved significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarking results on the COCO dataset for FSIS, gFSIS, and iFSIS settings, our method achieves a competitive performance compared to existing approaches across different shots, e.g., we boost nAP by noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
translated by 谷歌翻译
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs are with a large number of parameters, which makes these GNNs computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, where a light-weighted model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle such a problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborates that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
translated by 谷歌翻译
This paper focuses on designing efficient models with low parameters and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, trading-off model accuracy and constrained resources still need further improvements. This work rethinks the essential unity of efficient Inverted Residual Block in MobileNetv2 and effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance though sharing the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern \textbf{I}nverted \textbf{R}esidual \textbf{M}obile \textbf{B}lock (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase \textbf{E}fficient \textbf{MO}del (EMO) based only on a series of iRMBs for dense applications. Massive experiments on ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, \eg, our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 that surpass \textbf{SoTA} CNN-/Transformer-based models, while trading-off the model accuracy and efficiency well.
translated by 谷歌翻译